Introduction
The diagnosis of myeloproliferative neoplasms (MPNs) often involves complex and nuanced interpretations of hematologic data, which can lead to diagnostic ambiguity. Machine learning (ML) models offer a promising solution to enhance diagnostic accuracy and efficiency. By leveraging large datasets and sophisticated algorithms, ML can assist in differentiating between various subtypes of MPNs and other related hematologic conditions, thereby reducing uncertainty and improving patient outcomes.
Methods
A systematic review was conducted to identify studies utilizing ML models for the diagnosis of MPNs. The search strategy included a comprehensive query of PubMed and other relevant databases, adhering to PRISMA guidelines. Studies were selected based on their use of ML techniques to analyze clinical and laboratory data for MPN diagnosis. The inclusion criteria required studies to report performance metrics such as accuracy, sensitivity, specificity, and area under the curve (AUC).
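The performance metrics named in the inclusion criteria can be illustrated with a minimal sketch. The labels, scores, and the 0.5 decision threshold below are invented for illustration and do not come from any of the reviewed studies; the function names are likewise hypothetical. The sketch assumes binary labels (1 = disease, 0 = no disease) with both classes present.

```python
# Hypothetical toy data: 1 = MPN, 0 = reactive sample (not from any study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # model output probabilities
THRESHOLD = 0.5  # assumed decision threshold
y_pred = [1 if s >= THRESHOLD else 0 for s in scores]


def confusion_counts(truth, pred):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def diagnostic_metrics(truth, pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(truth, pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(truth, score):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(truth, score) if t == 1]
    neg = [s for t, s in zip(truth, score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


metrics = diagnostic_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

Unlike the other three metrics, AUC does not depend on the chosen threshold, which is one reason studies often report it alongside sensitivity and specificity.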
Results
We identified 11 studies that used various ML models for MPN diagnosis. Below is a summary of the top five models based on their performance and innovative approaches:
1. Sirinukunwattana et al.
Outcome: Automated analysis of megakaryocytes to categorize MPNs and distinguish them from reactive bone marrow samples.
Advantages: Provides a quick assessment of sequential bone marrow samples and a comprehensive overview of megakaryocytic cells.
Limitations: Requires additional parameters like marrow cellularity, lineage maturation, fibrosis degree, and blast cell estimation for complete MPN classification.
Model: Unsupervised Learning - Principal Component Analysis (PCA).
Uses: Reduces the dimensionality of high-dimensional data and supports exploratory data analysis.
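As a rough illustration of how PCA reduces dimensionality, the sketch below finds the first principal component of a small table by power iteration on the covariance matrix and projects each sample onto it. This is not the pipeline used by Sirinukunwattana et al.; the three features and their values are invented, and all function names are hypothetical.

```python
import math

# Hypothetical toy data: rows are samples, columns are three made-up
# morphological features. The first two are strongly correlated.
rows = [
    [2.0, 1.9, 0.5],
    [3.1, 3.0, 0.4],
    [4.0, 4.2, 0.6],
    [5.2, 5.0, 0.5],
    [6.1, 6.1, 0.4],
]


def center(data):
    """Subtract the column mean from every feature."""
    n, d = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in data]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, direction):
    """One-dimensional score of each sample along the given direction."""
    return [sum(a * b for a, b in zip(r, direction)) for r in centered]


centered = center(rows)
cov = covariance(centered)
pc1 = first_component(cov)
scores = project(centered, pc1)
print(pc1, scores)
```

Here the three original features collapse to a single score per sample, and the component loads almost entirely on the two correlated features. In practice a library routine would be used and more than one component retained.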
2. Kimura et al.
Outcome: Automated diagnostic support system for MPNs using peripheral blood specimens.
Advantages: Provides fast assessment and accurate differentiation of polycythemia vera, essential thrombocythemia, and myelofibrosis.
Limitations: Single-center study with a limited number of cases.
Model: Deep Learning - Convolutional Neural Network (CNN).
Uses: Performs image recognition and classification, learning features automatically from raw data.
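The core operation of a CNN layer can be sketched in a few lines: a small kernel slides over the image, and the resulting feature map passes through a nonlinearity. The example below is not the network of Kimura et al.; the 4x4 "image" is invented and the kernel is set by hand to respond to a vertical edge, whereas a trained CNN learns its kernel weights from data. As in most deep learning libraries, the "convolution" here is a cross-correlation (the kernel is not flipped).

```python
# Hypothetical 4x4 grayscale patch with a dark-to-bright vertical edge.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-set 2x2 kernel that responds to a left-to-right brightness increase.
# In a real CNN these weights are learned during training.
kernel = [
    [-1, 1],
    [-1, 1],
]


def conv2d_valid(img, ker):
    """Slide the kernel over the image without padding ("valid" mode)."""
    kh, kw = len(ker), len(ker[0])
    out_h = len(img) - kh + 1
    out_w = len(img[0]) - kw + 1
    return [
        [
            sum(img[r + i][c + j] * ker[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Rectified linear unit: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


feature_map = relu(conv2d_valid(image, kernel))
print(feature_map)  # the middle column lights up where the edge sits
```

A full classifier stacks many such layers with learned kernels, pooling, and a final classification layer.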
3. Asaulenko et al.
Outcome: Differentiation between essential thrombocythemia and primary myelofibrosis based on histotopographical features of megakaryocytes.
Advantages: Reveals patterns of megakaryocyte distribution in bone marrow of patients with JAK2/CALR mutations.
Limitations: The correct diagnostic prediction rate for primary myelofibrosis was only 40%.
Model: Unsupervised Learning - Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
Uses: Groups spatially dense points into clusters of arbitrary shape and flags isolated points as noise.
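A minimal sketch of DBSCAN on two-dimensional coordinates is given below. It is not the implementation used by Asaulenko et al.; the point coordinates (imagined here as cell centroids), the radius `eps`, and the density threshold `min_pts` are all invented for illustration. The neighbor search is a brute-force scan, which is adequate only for small inputs.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Hypothetical centroids: two tight groups and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (5, 20),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The number of clusters is not specified in advance; it follows from `eps` and `min_pts`, which is why the method suits exploratory analysis of spatial distributions.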
4. Kantardzic et al.
Outcome: Extraction of new decision rules for diagnosing polycythemia vera using a reduced and optimized set of lab parameters.
Advantages: Reduces diagnostic parameters to four while maintaining good classification results.
Limitations: Not diagnostic on its own; complements standard PVSG criteria.
Model: Supervised Learning - Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs).
Uses: Handle classification and regression tasks; both require labeled data for training, and ANNs in particular benefit from large amounts of it.
5. Shen et al.
Outcome: Prediction of advanced MPNs using progressive platelet transcriptomic markers.
Advantages: Provides a comprehensive catalog of platelet transcriptome in chronic MPNs and accurate prediction of myelofibrosis using fewer than five candidate markers.
Limitations: Only analyzes platelet-derived molecular alterations, requiring further biological and computational validations.
Model: Supervised Learning - Multiple LASSO (Least Absolute Shrinkage and Selection Operator) penalized regression classifiers.
Uses: Suitable for classification and prediction tasks, especially when there are more features than observations.
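The way a LASSO penalty selects a few markers can be sketched with coordinate descent and soft thresholding. For simplicity, the example fits a penalized least-squares regression rather than the penalized classifiers used by Shen et al., and it assumes centered features and a centered response so that no intercept is needed. The data, the penalty strength `lam`, and the function names are invented for illustration.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=200):
    """Minimize (1/2n) * ||y - Xw||^2 + lam * ||w||_1 one weight at a time."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho = 0.0
            for i in range(n):
                # Prediction from all features except j.
                partial = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / z
    return w


# Hypothetical centered design: three mutually orthogonal feature columns.
X = [
    [-2, 2, -1],
    [-1, -1, 2],
    [0, -2, 0],
    [1, -1, -2],
    [2, 2, 1],
]
# The response depends only on the first feature (y = 3 * x0).
y = [-6, -3, 0, 3, 6]

weights = lasso_coordinate_descent(X, y, lam=0.5)
print(weights)  # [2.75, 0.0, 0.0]
```

The penalty shrinks the informative weight from 3 to 2.75 and sets the two uninformative weights exactly to zero, which is the selection behavior that makes LASSO useful when features outnumber observations.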
Conclusion
This review highlights the potential of various ML models to improve the diagnosis of MPNs. The top models demonstrated high accuracy and innovative approaches to data analysis, although performance was uneven: in Asaulenko et al., for example, the correct diagnostic prediction rate for primary myelofibrosis was only 40%. Other limitations include small sample sizes and the need for further validation. Future research should focus on standardizing these models and integrating them into clinical practice to enhance diagnostic precision and patient care.
No relevant conflicts of interest to declare.